We study behavioral action sequences of players in a massive multiplayer
online game. In their virtual life players use eight basic
actions which allow them to interact with each other. These ac- 
tions are communication, trade, establishing or breaking friendships 
and enmities, attack, and punishment. We measure the probabilities 
for these actions conditional on previous taken and received actions 
and find a dramatic increase of negative behavior immediately after 
receiving negative actions. Similarly, positive behavior is intensi- 
fied by receiving positive actions. We observe a tendency towards 
anti-persistence in communication sequences. Classifying actions as 
positive (good) and negative (bad) allows us to define binary 'world 
lines' of lives of individuals. Positive and negative actions are per- 
sistent and occur in clusters, indicated by large scaling exponents 
$\alpha \approx 0.87$ of the mean square displacement of the world lines. For
all eight action types we find strong signs for high levels of repet- 
itiveness, especially for negative actions. We partition behavioral 
sequences into segments of length n (behavioral 'words' and 'mo- 
tifs') and study their statistical properties. We find two approximate 
power laws in the word ranking distribution, one with an exponent 
of $\kappa \approx 1$ for the ranks up to 100, and another with a lower exponent
for higher ranks. The Shannon n-tuple redundancy yields large
values and increases in terms of word length, further underscoring 
the non-trivial statistical properties of behavioral sequences. On the 
collective, societal level the timeseries of particular actions per day 
can be understood by a simple mean-reverting log-normal model. 

Human behavior | Time series analysis | Scaling laws | Quantitative social science
| Massive multiplayer online game

Societies can be seen as individuals interacting through a 
multiplex network (MPN), i.e. a superposition of several 
social networks defined on the same set of nodes (individuals) 
[1,2]. Different types of networks correspond to different types 
of social interactions. For example the communication sub- 
network of the MPN is the network whose links correspond 
to the exchange of information by means of emails, telephone 
calls, or letters. Another subnetwork is the trading network, 
where goods or services are exchanged between individuals, 
in exchange for other goods, money, or -rarely- for nothing. 
Each of these interactions usually needs an initial action taken 
by one of the subjects involved in the exchange, the sender, 
and a target to receive it, the recipient. Actions can (but 
do not have to) be reciprocated, so that in general the MPN 
consists of a set of directed and weighted subnetworks. The 
MPN is a highly non-trivial dynamical object. The differ- 
ent social networks within the MPN are not independent but 
strongly influence each other through a network-network in- 
teraction. To understand systemic properties of societies it is 
essential to detect and quantify the organizational principles 
behind such mutual influences. The MPN is an example of a 
co-evolving structure: on one hand the actions of individuals
shape and define the topological structure of the MPN. On 
the other hand the topology of the MPN constrains and influ- 
ences the possible actions which take place on the MPN. In 
general the MPN of a society cannot be observed due to immense
requirements on synchronized data acquisition. Despite
these difficulties, the analysis of small-scale MPNs has a tra- 
dition in the social sciences [1, 3, 4, 5]. Concerning large-scale 
Fig. 1. (a) Short segment of action sequences of three players, $A^{146}$, $A^{199}$, and
$A^{701}$. Some actions of players 146 and 701 are directed toward player 199. This
results in a sequence of received-actions for player 199, $R^{199} = \{\cdots \mathrm{ATTCE} \cdots\}$.
The combined sequence of actions (originating from and directed to) player 199,
$C^{199}$, is shown in the last line; red letters mark actions from others directed to
player 199. (b) Schematic illustration showing the definition of a binary walk in
'good-bad' action space (good-bad 'world line'). A positive action (C, T, F or X)
means an upward move, a negative action (A, B, D and E) is a downward move.
Good people have rising world lines.



studies, recently there have been significant achievements in 
understanding a number of massive social networks on a quan- 
titative basis, such as the cell phone communication network 
[6, 7, 8], features of the world-trade network [9], email networks
[10], the network of financial debt [11] and the network of
financial flows [12]. The integration of various dynamical
networks of an entire society has so far been beyond the scope
of any realistic data source. However, with the increasing
availability of vast amounts of electronic fingerprints people
leave throughout their lives, this situation is about to
change. Online sources are capturing more and more aspects 
of life, boosting our understanding of collective human behav- 
ior [13, 14]. One particular source where complete behavioral 
multiplex data is available on the society level are massive 
multiplayer online games (MMOGs). In MMOGs hundreds of 
thousands of players meet online in a 'virtual life' where their 
actions can be easily studied [15]. Players have to gain their
living through economic activity and usually are integrated 
in several types of social networks. In such games commu- 
nication networks, friendship and enmity networks have been 
studied, initially as separated entities [16, 17]. In [2] trading, 
aggression and punishment networks have been added to the 
analysis and first measurements on mutual network-network 
influences were reported. 

In this paper we do not focus on the full MPN but on 
the dynamics (actions) taking place on its nodes. We report 
on the nature of sequences of human behavioral actions in 
a virtual universe of a MMOG. There sequential behavioral 
data is available on the scale of an entire society, which is in 
general impossible to obtain. The unique data of the
online game Pardus [18] allows us to unambiguously track all actions
of all players over long time periods. We focus on the
stream of eight types of actions which are translated into an 
8-letter alphabet. This code of actions of individual players 
is then analyzed by means of standard timeseries approaches 
as have been used, for example, in DNA sequence analyses
[19, 20, 21]. 
The game. The dataset contains practically all actions of all
players of the MMOG Pardus since the game went online in
2004 [18]. Pardus is an open-ended online game with a worldwide
player base of currently more than 370,000 people. Players
live in a virtual, futuristic universe in which they interact
with others in a multitude of ways to achieve their self-posed 
goals [22]. Most players engage in various economic activities 
typically with the (self-posed) goal to accumulate wealth and 
status. Social and economic decisions of players are often
strongly influenced and driven by social factors such as friendship,
cooperation, and conflict. Conflictual relations may result
in aggressive acts such as attacks, fights, punishment, or
even destruction of another player's means of production or 
transportation. The dataset contains longitudinal and rela- 
tional data allowing for a complete and dynamical mapping 
of multiplex relations of the entire virtual society, over 1238 
days. The behavioral data are free of 'interviewer-bias' or lab- 
oratory effects since users are not reminded of their actions 
being logged during playing. The longitudinal aspect of the 
data allows for the analysis of dynamical aspects such as the 
emergence and evolution of network structures. It is possible 
to extract multiple social relationships between a fixed set of 
humans [2]. 

Human behavioral sequences. We consider eight different ac- 
tions every player can execute at any time. These are commu- 
nication (C), trade (T), setting a friendship link (F), removing 
an enemy link (forgiving) (X), attack (A), placing a bounty 
on another player (punishment) (B), removing a friendship 
link (D), and setting an enemy link (E). While C, T, F and X 
can be associated with positive (good) actions, A, B, D and
E are hostile or negative (bad) actions. We classify communi- 
Fig. 2. Timeseries of the daily number of (a) trades, (b) attacks, (c) communications
in the first 1238 days in the game. Clearly a mean reverting tendency of the
three processes can be seen. (d) Simulation of a model timeseries, Eq. (1), with
$\rho = 0.94$. We use the values from the $N_C$ timeseries, $R = 4000$, and standard
deviation $\sigma = 0.12$. Compare with the actual $N_C$ in (c). The only free parameter
in the model is $\rho$. Parameters are from Tab. 1. Mean reversion and log-normality
motivate the model presented in Eq. (1). (e) The distributions of log-increments
$r_Y$ of the processes and the model. All follow approximate Gaussian distribution
functions.



cation as positive because only a negligible part of communi- 
cation takes place between enemies [17]. Segments of action 
sequences of three players (146, 199 and 701) are shown in the 
first three lines of Fig. 1 (a). 



Table 1. First row: total number of actions by all players (with at least 1000 actions) in the Artemis universe of the Pardus
game. Further rows: first 4 moments of $r_Y(d)$, the distribution of the log-increments of the $N_Y$ processes (see text).
Approximate log-normality is indicated. The large values of kurtosis for T and A result from a few extreme outliers.
[Table body not recovered from the source; surviving entries: 0.002, 0.001, 0.26, 0.002.]

We consider three types of sequences for any particular
player. The first is the stream of $N$ consecutive actions
$A^i = \{a_n \mid n = 1, \dots, N\}$ which player $i$ performs during his
'life' in the game. The second sequence is the (time-ordered)
stream of actions that player $i$ receives from all the other
players in the game, i.e. all the actions which are directed
towards player $i$. We denote by $R^i = \{r_n \mid n = 1, \dots, M\}$
received-action sequences. Finally, the third sequence is the
time-ordered combination of player $i$'s actions and
received-actions, which is a chronological sequence from the elements of
$A^i$ and $R^i$ in the order of occurrence. The combined sequence
we denote by $C^i$; its length is $M + N$, see also Fig. 1 (a).
The nth element of one of these series is denoted by $A^i(n)$,
$R^i(n)$, or $C^i(n)$. We do not consider the actual time between
two consecutive actions, which can range from milliseconds to
weeks; rather we work in 'action-time'.

If we assign $+1$ to any positive action C, T, F or X, and
$-1$ to the negative actions A, B, D and E, we can translate
a sequence $A^i$ into a symbolic binary sequence $A^i_{\mathrm{bin}}$. From
the cumulative sum of this binary sequence a 'world line' or
'random walk' for player $i$ can be generated,
$W^i_{\mathrm{good\text{-}bad}}(t) = \sum_{n=1}^{t} A^i_{\mathrm{bin}}(n)$, see Fig. 1 (b).
Similarly, we define a binary sequence from the combined sequence $C^i$,
where we assign $+1$ to an executed action and $-1$ to a
received-action. This sequence we call $C^i_{\mathrm{bin}}$; its cumulative sum,
$W^i_{\mathrm{act\text{-}rec}}(t) = \sum_{n=1}^{t} C^i_{\mathrm{bin}}(n)$,
is the 'action-receive' random walk or world line. Finally, we
denote the number of actions which occurred during a day in
the game by $N_Y(d)$, where $d$ indicates the day and $Y$ stands
for one of the eight actions.
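
The construction of the two world lines can be illustrated with a few lines of Python. This is a minimal sketch and not the code used for the study; the function names and the example sequences are hypothetical, and a combined sequence is assumed to be stored as (letter, executed) pairs.

```python
from itertools import accumulate

POSITIVE = set("CTFX")  # communication, trade, set friend, remove enemy
NEGATIVE = set("ABDE")  # attack, bounty, remove friend, set enemy


def good_bad_steps(actions):
    """Map a string of action letters to +1 (positive) / -1 (negative) steps."""
    steps = []
    for letter in actions:
        if letter in POSITIVE:
            steps.append(1)
        elif letter in NEGATIVE:
            steps.append(-1)
        else:
            raise ValueError(f"unknown action letter: {letter!r}")
    return steps


def act_rec_steps(combined):
    """Map (letter, executed) pairs to +1 for executed, -1 for received."""
    return [1 if executed else -1 for _letter, executed in combined]


def world_line(steps):
    """Cumulative sum W(t) = sum of the first t steps."""
    return list(accumulate(steps))


# Hypothetical example sequences, for illustration only.
example_actions = "CCTACFEC"
example_combined = [("C", True), ("C", False), ("A", False), ("T", True)]

w_good_bad = world_line(good_bad_steps(example_actions))
w_act_rec = world_line(act_rec_steps(example_combined))
```

Working in 'action-time' means that the list index plays the role of t; no timestamps are needed.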



Results 

The number of occurrences of the various actions of all players
over the entire time period is summarized in Tab. 1 (first line).
Communication is the most dominant action, followed by attacks
and trading which are each about an order of magnitude
less frequent. The daily number of all trades, attacks and
communications, $N_T(d)$, $N_A(d)$ and $N_C(d)$, is shown in Fig. 2 (a),
(b) and (c), respectively. These processes are reverting around
a mean, $R_Y$. All processes of actions show an approximate
Gaussian statistic of their log-increments, $r_Y(d) = \log \frac{N_Y(d)}{N_Y(d-1)}$.
The first 4 moments of the $r_Y$ series are listed in Tab. 1.
The relatively large kurtosis for T and A results from a few
extreme outliers. The distributions of log-increments for the
$N_C$, $N_T$ and $N_A$ timeseries are shown in Fig. 2 (e). The lines
are Gaussians for the respective mean and standard deviation
from Tab. 1. As maybe the simplest mean-reverting model
with approximate lognormal distributions, we propose
where $\rho_Y$ is the mean reversion coefficient, $\xi(d)$ is a realization
of a zero mean Gaussian random number with standard
deviation $\sigma_Y$, and $R_Y$ is the value to which the process $N_Y(d)$
reverts. $\sigma_Y$ is given by the third line in Tab. 1.
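
The explicit form of Eq. (1) is not reproduced in this text, so the following Python sketch rests on an assumption: it uses a first-order autoregressive process in log N, which is one simple model matching the description (mean reversion towards R with coefficient rho, Gaussian noise xi of standard deviation sigma, approximately log-normal values). The function names are hypothetical; the parameter values are those quoted in the caption of Fig. 2.

```python
import math
import random


def simulate_daily_counts(days, rho, R, sigma, seed=0):
    """Simulate N(d) under the ASSUMED model
    log N(d) = rho * log N(d-1) + (1 - rho) * log R + xi(d),
    with xi(d) Gaussian, zero mean, standard deviation sigma."""
    rng = random.Random(seed)
    log_r = math.log(R)
    log_n = log_r  # start the process at its reversion level
    series = []
    for _ in range(days):
        log_n = rho * log_n + (1.0 - rho) * log_r + rng.gauss(0.0, sigma)
        series.append(math.exp(log_n))
    return series


def log_increments(series):
    """r(d) = log(N(d) / N(d-1)) for consecutive days."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]


# Values from the caption of Fig. 2: rho = 0.94, R = 4000, sigma = 0.12.
sim = simulate_daily_counts(days=1238, rho=0.94, R=4000.0, sigma=0.12)
r_sim = log_increments(sim)
```

With sigma set to zero the simulated series stays at R, which is a quick check that R acts as the reversion level under this assumed form.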

Transition probabilities. With $p(Y|Z)$ we denote the probability
that an action of type Y follows an action of type Z in the
behavioral sequence of a player. Y and Z stand for any of the
eight actions, executed or received (received is indicated by a
subscript r). In Fig. 3 (a) the transition probability matrix
$p(Y|Z)$ is shown. The y axis of the matrix indicates the action
(or received-action) happening at a time t, the probabilities
for the actions (or received-actions) that immediately follow
are given in the corresponding horizontal place.
Fig. 3. (a) Transition probabilities $p(Y|Z)$ for actions (and received-actions)
Y at a time t + 1, given that a specific action Z was executed or received in the
previous time-step t. Received-actions are indicated by a subscript r. Normalization
is such that rows add up to one. The large values in the diagonal signal that human
actions are highly clustered or repetitive. Large values for $C \to C_r$ and $C_r \to C$
reveal that communication tends to be an anti-persistent activity: it is more likely
to receive a message after one sent a message and vice versa, than to send or to
receive two consecutive messages. (b) The ratio $p(Y|Z)/p(Y)$ shows the influence of an
action Z at a previous time-step t on a following action Y at a time t + 1, where
Y and Z can be positive or negative actions, executed or received (received actions
are indicated by the subscript r). In brackets, we report the Z-score (significance
in number of standard deviations) with respect to a sample of 100 randomized versions
of the dataset. The cases for which the transition probability is significantly higher
(lower) than expected in uncorrelated sequences are highlighted in red (green).
Receiving a positive action after performing a positive action is highly overrepresented,
and vice versa. Performing (receiving) a negative action after performing (receiving)
another negative one is also highly overrepresented. Performing a negative action has
no influence on receiving a negative action next. All other combinations are strongly
underrepresented, for example after performing a negative action it is very unlikely to
perform a positive action with respect to the uncorrelated case.



This transition matrix specifies to which extent an action
or a received action of a player is influenced by the action that
was done or received at the previous time-step. In fact, if the
behavioral sequences of players had no correlations, i.e. the
probability of an action, received or executed, is independent
of the history of the player's actions, the transition probability
$p(Y|Z)$ simply is $p(Y)$, i.e. the probability that an action
or received action Y occurs in the sequence is determined by
its relative frequency only. Therefore, deviations of the ratio
$p(Y|Z)/p(Y)$ from 1 indicate correlations in sequences. In Fig. 3 (b)
we report the values of $p(Y|Z)/p(Y)$ for actions and received actions
(received actions are indicated with the subscript r) classified
only according to their positive (+) or negative (-) connotation.
In brackets we report the Z-score with respect to the
uncorrelated case. We find that the probability to perform a
good action is significantly higher if at the previous time-step
a positive action has been received. Similarly, it is more likely
that a player is the target of a positive action if at the previous
time-step he executed a positive action. Conversely, it is
highly unlikely that after a good action, executed or received,
Fig. 4. (a) World lines of good-bad action random walks of the 1,758 most active players, (b) distribution of their slopes $k$ and (c) of their scaling exponents $\alpha$. By
definition, players who perform more good (bad) than bad (good) actions have the endpoints of their world lines above (below) 0 in (a) and only fall into the $k > 0$ ($k < 0$)
category in (b). (d) World lines of action-received random walks, (e) distribution of their slopes $k$ and (f) of their scaling exponents $\alpha$. The inset in (d) shows only the world
lines of bad players. These players are typically dominant, i.e. they perform significantly more actions than they receive. In total the players perform many more good than bad
actions and are strongly persistent with good as well as with bad behavior, see (c), i.e. actions of the same type are likely to be repeated.



a player acts negatively or is the target of a negative action. 
Instead, in the case a player acts negatively, it is most likely 
that he will perform another negative action at the follow- 
ing time-step, while it is highly improbable that the following 
action, executed or received, will be positive. Finally, in the 
case a negative action is received, it is likely that another neg- 
ative action will be received at the following time-step, while 
all other possible actions and received actions are underrepresented.
The high statistical significance of the cases $p(-|-)$
and $p(-_r|-_r)$ hints toward a high persistence of negative actions
in the players' behavior, see below.
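
The estimation of $p(Y|Z)$ and of the ratio $p(Y|Z)/p(Y)$ from symbol sequences can be sketched as follows. This is an illustration under stated assumptions, not the implementation used for Fig. 3: the symbols "+", "-", "+r", "-r" are a hypothetical encoding of executed and received positive and negative actions, and $p(Y)$ is taken here as the relative frequency of Y among all successor positions.

```python
from collections import Counter


def transition_ratios(sequences):
    """Estimate p(Y|Z) and p(Y|Z)/p(Y) from consecutive symbol pairs.

    sequences: iterable of lists of hashable symbols.
    Returns two dicts keyed by (Z, Y): conditional probabilities and ratios.
    """
    pair_counts = Counter()
    prev_counts = Counter()  # how often Z occurs as the earlier symbol
    next_counts = Counter()  # how often Y occurs as the later symbol
    for seq in sequences:
        for z, y in zip(seq, seq[1:]):
            pair_counts[(z, y)] += 1
            prev_counts[z] += 1
            next_counts[y] += 1
    total = sum(pair_counts.values())
    p_cond, ratio = {}, {}
    for (z, y), count in pair_counts.items():
        p_cond[(z, y)] = count / prev_counts[z]  # rows (fixed Z) sum to one
        ratio[(z, y)] = p_cond[(z, y)] / (next_counts[y] / total)
    return p_cond, ratio


# Hypothetical toy sequence, for illustration only.
toy = ["+", "+r", "+", "+r", "-", "-"]
p_cond, ratio = transition_ratios([toy])
```

A ratio above 1 marks a transition that is overrepresented relative to the uncorrelated case, a ratio below 1 an underrepresented one.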

Another finding is obtained by considering only pairs of 
received actions followed by performed actions. This approach 
allows us to quantify the influence of received actions on the performed
actions of players. For these pairs we measure a probability
of 0.02 of performing a negative action after a received
positive action. This value is significantly lower compared 
to the probability of 0.10 obtained for randomly reshuffled 
sequences. Similarly, we measure a probability of 0.27 of per- 
forming a negative action after a received negative action. 
Note that this result is not in contrast with the values in 
Fig. 3 (b), since only pairs made up of received actions and 
performed actions are taken into account. 
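
A comparison with reshuffled sequences of this kind can be sketched in Python. The exact randomization procedure is not specified in the text, so the sketch assumes a plain random permutation of a player's combined sequence; the function names and the toy data are hypothetical.

```python
import math
import random
import statistics


def p_neg_after(combined, received_sign):
    """Among pairs (received action of the given sign, then executed action),
    return the fraction in which the executed action is negative.

    combined: list of (sign, executed) with sign in {+1, -1}.
    """
    hits = total = 0
    for (s0, e0), (s1, e1) in zip(combined, combined[1:]):
        if (not e0) and e1 and s0 == received_sign:
            total += 1
            hits += int(s1 == -1)
    return hits / total if total else float("nan")


def shuffled_z_score(combined, received_sign, n_shuffles=100, seed=0):
    """Observed value, null mean, and Z-score against shuffled copies.
    ASSUMPTION: the null model is a uniform random permutation."""
    rng = random.Random(seed)
    observed = p_neg_after(combined, received_sign)
    work = list(combined)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(work)
        value = p_neg_after(work, received_sign)
        if not math.isnan(value):
            null.append(value)
    if len(null) < 2:
        return observed, float("nan"), float("nan")
    mu = statistics.mean(null)
    sd = statistics.pstdev(null)
    z = (observed - mu) / sd if sd > 0 else float("nan")
    return observed, mu, z


# Hypothetical toy data: every received action is answered in kind.
toy = [(+1, False), (+1, True)] * 40 + [(-1, False), (-1, True)] * 10
obs_neg, mu_neg, z_neg = shuffled_z_score(toy, received_sign=-1)
```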

World lines. The world lines $W_{\mathrm{good\text{-}bad}}$ of good-bad action
sequences are shown in Fig. 4 (a), the action-received world
lines $W_{\mathrm{act\text{-}rec}}$ in Fig. 4 (d).

As a simple measure to characterize these world lines we
define the slope $k$ of the line connecting the origin of the
world line to its end point (last action of the player). A
slope of $k = 1$ ($-1$) in the good-bad world lines $W_{\mathrm{good\text{-}bad}}$
indicates that the player performed only positive (negative)
actions. The slope $k^i$ is an approximate measure of 'altruism'
for player $i$. The histogram of the slopes for all players is
shown in Fig. 4 (b), separated into good (blue) and bad (red)
players, i.e. players who have performed more good than bad
actions and vice versa. The mean and standard deviation of
slopes of good, bad, and all players are $k^{\mathrm{good}} = 0.81 \pm 0.19$,
$k^{\mathrm{bad}} = -0.\text{[??]} \pm 0.28$ (digits illegible in the source), and
$k^{\mathrm{all}} = 0.76 \pm 0.31$, respectively.
Simulated random walks with the same probability 0.90 of
performing a positive action yield a much lower variation,
$k^{\mathrm{sim}} = 0.8\text{[?]} \pm 0.01$, pointing at an inherent heterogeneity of
human behavior. For the combined action-received-action world
line $W_{\mathrm{act\text{-}rec}}$ the slope is a measure of how well a person is
integrated in her social environment. If $k = 1$ the person only
acts and receives no input, she is 'isolated' but dominant. If
the slope is $k = -1$ the person is driven by the actions of
others and never acts or reacts. The histogram of slopes
for all players is shown in Fig. 4 (e). Most players are well
within the ±45 degree cone. Mean and standard deviation of
slopes of good, bad, and all players are $k^{\mathrm{good}} = 0.02 \pm 0.10$,
$k^{\mathrm{bad}} = 0.30 \pm 0.1\text{[?]}$, and $k^{\mathrm{all}} = 0.0\text{[?]} \pm 0.12$, respectively. Bad
players tend to be dominant, i.e. they perform significantly
more actions than they receive. For simulated random
walks with equal probabilities for up and down moves and
the same sequence lengths, we again find a much
narrower distribution, with slope $k^{\mathrm{sim}} = 0.00 \pm 0.01$.
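
The slope and its comparison with simulated walks can be illustrated in Python. This is a sketch only: the function names are hypothetical, and the 200 walks of 1000 steps each are arbitrary illustrative sizes rather than the sequence lengths of the actual players.

```python
import random
import statistics


def slope(steps):
    """Slope of the line from the origin to the end point of a world line,
    i.e. W(N) / N for a sequence of N steps of +1 or -1."""
    return sum(steps) / len(steps)


def simulated_slopes(lengths, p_up, seed=0):
    """Slopes of independent random walks, one per entry of `lengths`,
    each step being +1 with probability p_up and -1 otherwise."""
    rng = random.Random(seed)
    slopes = []
    for n in lengths:
        steps = [1 if rng.random() < p_up else -1 for _ in range(n)]
        slopes.append(slope(steps))
    return slopes


# Illustrative sizes only (not the players' actual sequence lengths).
sim_slopes = simulated_slopes([1000] * 200, p_up=0.90)
sim_mean = statistics.mean(sim_slopes)
sim_std = statistics.pstdev(sim_slopes)
```

For independent steps the expected slope is 2 * p_up - 1 and its spread shrinks as one over the square root of the sequence length, which is why a much broader empirical distribution points to heterogeneity between players.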

As a second measure we use the mean square displacement
of world lines to quantify the persistence of action sequences,

$$M^2(\tau) = \langle [\Delta W(\tau) - \langle \Delta W(\tau) \rangle]^2 \rangle \qquad [2]$$

(see Materials and Methods). The histogram of exponents
$\alpha$ for the good-bad random walk, separated into good (blue)
and bad (red) players, is shown in Fig. 4 (c), for the
action-received-action world line in (f). In the first case strongly
persistent behavior is obvious, in the second there is a slight
tendency towards persistence. Mean and standard deviation
for the good-bad world lines are $\alpha_{\mathrm{good\text{-}bad}} = 0.87 \pm 0.06$, for
the action-received actions $\alpha_{\mathrm{act\text{-}rec}} = 0.59 \pm 0.10$. Simulated
sequences of random walks have, as expected by definition,
an exponent of $\alpha_{\mathrm{rnd}} \approx 0.5$, again with a very small standard
deviation of about 0.02. Figure 4 (a) also indicates that
the lifetime of players who use negative actions frequently is
short. The average lifetime for players with a slope $k < 0$ is
2528 ± 1856 actions, compared to players with a slope $k > 0$
with 3909 ± 4559 actions. The average lifetime of the whole
sample of players is 3849 ± 4484 actions.

Motifs, Entropy and Zipf law. By considering all the sequences 
of actions $A^i$ of all possible players $i$, we have an ensemble
which allows us to perform a motif analysis [25]. We define an n-string
as a subsequence of n contiguous actions. An n-motif is
an n-string which appears in the sequences with a probability 
higher than expected, after lower-order correlations have been 
properly removed (see Materials and Methods). 

We computed the observed and expected probabilities $p^{\mathrm{obs}}$
and $p^{\mathrm{exp}}$ for all $8^2 = 64$ 2-strings and for all $8^3 = 512$ 3-strings,
focusing on those n-strings with the highest ratio $p^{\mathrm{obs}}/p^{\mathrm{exp}}$. Higher
orders are statistically not feasible due to combinatorial
explosion. We find that the 2-motifs in the sequences of actions
A are clusters of same letters: BB, DD, XX, EE, FF, AA with
ratios $p^{\mathrm{obs}}/p^{\mathrm{exp}} \approx 169, 136, 117, 31, 15, 10$, respectively. This
observation is consistent with the previous first-order
observation that actions cluster. The most significant 3-motifs
however are (with two exceptions) the palindromes: EAX, DAF,
DCD, DAD, BGB, BFB, with ratios $p^{\mathrm{obs}}/p^{\mathrm{exp}} \approx 123, 104, 74,
62, 33, 32$, respectively. The exceptions disappear when one
considers actions executed on the same screen in the game as
equivalent, i.e. setting or removing friends or enemies: F, D,
E, X. This observation hints towards processes where single
actions of one type tend to disrupt a flow of actions of another
type.

Finally, we partition the action sequences into n-strings
('words'). Fig. 5 shows the rank distribution of word
occurrences of different lengths n. The distribution shows an
approximate Zipf law [24] (slope of $\kappa = -1$) for ranks below
100. For ranks between 100 and 25,000 the scaling exponent
approaches a smaller value of about $\kappa \approx -1.5$. The Shannon
n-tuple redundancy (see e.g. [19, 20, 21]) for symbol sequences
composed of 8 symbols (our action types) is defined as

$$R^{(n)} = 1 + \frac{1}{3n} \sum_{i=1}^{8^n} P_i^{(n)} \log_2 P_i^{(n)}, \qquad [3]$$

where $P_i^{(n)}$ is the probability of finding a specific n-letter
word. Uncorrelated sequences yield an equi-distribution,
$P_i^{(n)} = 8^{-n}$, i.e. $R^{(n)} = 0$. In the other extreme of only one
letter being used, $R^{(n)} = 1$. In Fig. 5 (inset) $R^{(n)}$ is shown as
a function of sequence length n. Shannon redundancy is not a
constant but increases with n. This indicates that
Boltzmann-Gibbs entropy might not be an extensive quantity for action
sequences [23].
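
The word statistics can be sketched in Python as follows. The sketch assumes that 'partitioning' means cutting each sequence into non-overlapping words of length n and discarding an incomplete last word, and that the redundancy is one minus the n-word entropy divided by its maximum n log2(8) = 3n bits; the function names are hypothetical.

```python
import math
from collections import Counter


def words(seq, n):
    """Cut a string into non-overlapping n-letter words (remainder dropped)."""
    return [seq[i:i + n] for i in range(0, len(seq) - n + 1, n)]


def word_probabilities(sequences, n):
    """Relative frequency of every n-letter word over all sequences."""
    counts = Counter(w for s in sequences for w in words(s, n))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def rank_distribution(sequences, n):
    """Word probabilities sorted from most to least frequent."""
    return sorted(word_probabilities(sequences, n).values(), reverse=True)


def redundancy(sequences, n, alphabet_size=8):
    """Shannon n-tuple redundancy: 1 - H_n / (n * log2(alphabet_size))."""
    probs = word_probabilities(sequences, n).values()
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / (n * math.log2(alphabet_size))


# Hypothetical toy sequences, for illustration only.
r_single_letter = redundancy(["CCCCCCCC"], n=2)  # one letter only
r_uniform = redundancy(["CTFXABDE"], n=1)        # all letters equally often
```

Plotting the output of `rank_distribution` on doubly logarithmic axes gives a rank plot of the kind shown in Fig. 5.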



Discussion 

The analysis of human behavioral sequences as recorded in a 
massive multiplayer online game shows that communication is 
by far the most dominant activity followed by aggression and 
trade. Communication events are about an order of magni- 
tude more frequent than attacks and trading events, showing 
the importance of information exchange between humans. It 
is possible to understand the collective timeseries of human ac- 
tions of a particular type (Ny) with a simple mean-reverting 
log-normal model. On the individual level we are able to iden- 
Fig. 5. Rank ordered probability distribution of 1 to 6 letter words. Slopes of
$\kappa = -1$ and $\kappa = -1.5$ are indicated for reference. The inset shows the Shannon
n-tuple redundancy as a function of word length n.



tify organizational patterns of the emergence of good overall 
behavior. Transition rates of actions of individuals show that 
positive actions strongly induce positive reactions. Negative
behavior on the other hand has a high tendency of being re- 
peated instead of being reciprocated, showing the 'propulsive' 
nature of negative actions. However, if we consider only reac- 
tions to negative actions, we find that negative reactions are 
highly overrepresented. The probability of acting out nega- 
tive actions is about 10 times higher if a person received a 
negative action at the previous timestep than if she received 
a positive action. The action of communication is found to 
be of highly reciprocal 'back-and-forth' nature. The analy- 
sis of binary timeseries of players (good-bad) shows that the 
behavior of almost all players is 'good' almost all the time. 
Negative actions are balanced to a large extent by good ones. 
Players with a high fraction of negative actions tend to have 
a significantly shorter life. This may be due to two reasons: 
First because they are hunted down by others and give up 
playing, second because they are unable to maintain a social 
life and quit the game because of loneliness or frustration. We 
interpret these findings as empirical evidence for self organiza- 
tion towards reciprocal, good conduct within a human society. 
Note that the game allows bad behavior in the same way as 
good behavior but the extent of punishment of bad behavior 
is freely decided by the players. 

Behavior is highly persistent in terms of good and bad, as 
seen in the scaling exponent ($\alpha \approx 0.87$) of the mean square
displacement of the good-bad world lines. This high persistence
means that good and bad actions are carried out in
clusters. Similarly high levels of persistence were found in
a recent study of human behavior [26]. A smaller exponent
($\alpha \approx 0.59$) is found for the action-received-action timeseries.

Finally we split behavioral sequences of individuals into 
subsequences (of length 1-6) and interpret these as behavioral 
'words'. In the ranking distribution of these words we find 
a Zipf law to about ranks of 100. For less frequent words 
the exponent in the rank distribution approaches a somewhat 
smaller exponent of about $\kappa \approx -1.5$. From word occurrence
probabilities we further compute the Shannon n-tuple redun- 
dancy which yields relatively large values when compared for 
example to those of DNA sequences [19, 20, 21]. This reflects
the dominance of communication over all the other actions.
The n-tuple redundancy is clearly not a constant, reflecting
again the non-trivial statistical structure of behavioral
sequences.



Materials and Methods 

The game Pardus [18] is sectioned into three independent 'uni- 
verses'. Here we focus on the 'Artemis' universe, in which we 
recorded player actions over the first 1,238 consecutive days 
of the universe's existence. Communication between any two 
players can take place directly, by using a one-to-one, e-mail- 
like private messaging system, or indirectly, by meeting in 
built-in chat channels or online forums. For the player ac- 
tion sequences analyzed we focus on one-to-one interactions 
between players only, and discard indirect interactions such 
as e.g. participation in chats or forums. Players can express 
their sympathy (distrust) toward other players by establishing 
so-called friendship (enmity) links. These links are only seen 
by the player marking another as a friend (enemy) and the re- 
spective recipient of that link. For more details on the game, 
see [17, 18]. From all sequences of all 34,055 Artemis players 
who performed or received an action at least once within 1,238 
days, we removed players with a life history of fewer than 1,000
actions, leading to the set of the most active 1,758 players 
which are considered throughout this work. 

All data used in this study is fully anonymized; the authors 
have the written consent to publish from the legal department 
of the Medical University of Vienna. 

Mean square displacement. The mean square displacement
$M^2$ of a world line $W$ is defined as
$M^2(\tau) = \langle [\Delta W(\tau) - \langle \Delta W(\tau) \rangle]^2 \rangle$,
where $\Delta W(\tau) = W(t + \tau) - W(t)$ and $\langle \cdot \rangle$ is the
average over all $t$. The asymptotic behavior of $M(\tau)$ yields
information about the 'persistence' of a world line. $M(\tau) \propto \tau^{1/2}$
is the pure diffusion case, $M(\tau) \propto \tau^{\alpha}$ with scaling exponent
$\alpha \neq 1/2$ indicates persistence for $\alpha > 1/2$, and anti-persistence
for $\alpha < 1/2$. Persistence means that the probability of making
an up (down) move at time $t + 1$ is larger (less) than $p = 1/2$,
if the move at time $t$ was an up move. For calculating the
exponents $\alpha$ we use a fit range of $\tau$ between 5 and 100. We
checked from the mean square displacements of single world
lines that this fit range is indeed reasonable.
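
The exponent estimate can be sketched in Python. The fitting procedure is not specified beyond the range of 5 to 100, so the sketch assumes an ordinary least-squares fit of log M against log tau over all integer tau in that range; the function names are hypothetical and the test walk is an uncorrelated random walk, for which an exponent near 0.5 is expected.

```python
import math
import random
from itertools import accumulate


def msd(world, tau):
    """M^2(tau): variance over t of the increments W(t + tau) - W(t)."""
    diffs = [world[t + tau] - world[t] for t in range(len(world) - tau)]
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / len(diffs)


def scaling_exponent(world, tau_min=5, tau_max=100):
    """Least-squares slope of log M(tau) versus log tau (ASSUMED fit method),
    where M(tau) is the square root of M^2(tau)."""
    xs, ys = [], []
    for tau in range(tau_min, tau_max + 1):
        m2 = msd(world, tau)
        if m2 > 0:  # skip lags with zero variance (log undefined)
            xs.append(math.log(tau))
            ys.append(0.5 * math.log(m2))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Uncorrelated walk of illustrative length; alpha should be close to 0.5.
rng = random.Random(1)
walk = list(accumulate(rng.choice((1, -1)) for _ in range(40000)))
alpha_random = scaling_exponent(walk)
```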

Motifs. We define n-strings as subsequences of n contiguous
actions. Across the entire ensemble, $8^n$ different n-strings
can appear, each of them occurring with a different probability.
The frequency, or observed probability, of each n-string
can be compared to its expected probability of occurrence,
which can be estimated on the basis of the observed
probability of lower order strings, i.e. on the frequency of
$(n-1)$-strings. For example, the expected probability of
occurrence of a 2-string $(A_t, A_{t+1})$ is estimated as the product
of the observed probability of the single actions $A_t$ and $A_{t+1}$,
$p^{\mathrm{exp}}(A_t, A_{t+1}) = p^{\mathrm{obs}}(A_t)\, p^{\mathrm{obs}}(A_{t+1})$. Similarly, the
probability of a 3-string $(A_t, A_{t+1}, A_{t+2})$ to occur can be estimated as
$p^{\mathrm{exp}}(A_t, A_{t+1}, A_{t+2}) = p^{\mathrm{obs}}(A_t, A_{t+1})\, p^{\mathrm{obs}}(A_{t+2}|A_{t+1})$, where
$p^{\mathrm{obs}}(A_{t+2}|A_{t+1})$ is the conditional probability to have action
$A_{t+2}$ following action $A_{t+1}$. By definition of conditional
probability, one has
$p^{\mathrm{obs}}(A_{t+2}|A_{t+1}) = \frac{p^{\mathrm{obs}}(A_{t+1}, A_{t+2})}{p^{\mathrm{obs}}(A_{t+1})}$ (see [25]
for details). An n-motif in the ensemble is then defined as an
n-string whose observed probability of occurrence is significantly
higher than its expected probability.
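
The two expected-probability formulas above translate directly into Python. The sketch below counts overlapping n-strings within each sequence and returns the ratio of observed to expected probability; the function names are hypothetical, and no significance threshold is applied because the text does not state one.

```python
from collections import Counter


def ngram_probs(sequences, n):
    """Observed probability of every n-string (overlapping windows)."""
    counts = Counter(
        s[i:i + n] for s in sequences for i in range(len(s) - n + 1)
    )
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def motif_ratios_2(sequences):
    """p_obs / p_exp for 2-strings, with p_exp(ab) = p_obs(a) * p_obs(b)."""
    p1 = ngram_probs(sequences, 1)
    p2 = ngram_probs(sequences, 2)
    return {g: p / (p1[g[0]] * p1[g[1]]) for g, p in p2.items()}


def motif_ratios_3(sequences):
    """p_obs / p_exp for 3-strings, with
    p_exp(abc) = p_obs(ab) * p_obs(bc) / p_obs(b)."""
    p1 = ngram_probs(sequences, 1)
    p2 = ngram_probs(sequences, 2)
    p3 = ngram_probs(sequences, 3)
    return {
        g: p / (p2[g[:2]] * p2[g[1:]] / p1[g[1]]) for g, p in p3.items()
    }


# Hypothetical toy sequence, for illustration only.
ratios2 = motif_ratios_2(["AABB"])
ratios3 = motif_ratios_3(["AABB"])
```

Sorting the returned dictionaries by value lists the candidate motifs in decreasing order of overrepresentation.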